Time series data are repeated measurements collected over time from the same system or subject
Repeated Measures with MEMs
When the number of repeated measurements is small (e.g. pre/post/follow-up for a psychology intervention), we can use Mixed Effect Models!
# mem with interaction of treatment and time
repeated_mlm <- lmer(
  Outcome ~ Treatment * Time + (1 + Time | Participant),
  data = rt_data
)
summary(repeated_mlm)
Linear mixed model fit by REML ['lmerMod']
Formula: Outcome ~ Treatment * Time + (1 + Time | Participant)
   Data: rt_data

REML criterion at convergence: 13724.7

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.9099 -0.5381  0.0003  0.5243  3.3140

Random effects:
 Groups      Name        Variance Std.Dev. Corr
 Participant (Intercept) 57.44040 7.5789
             Time         0.05631 0.2373  0.06
 Residual                 0.96553 0.9826
Number of obs: 3000, groups:  Participant, 1000

Fixed effects:
                        Estimate Std. Error t value
(Intercept)             49.05630    0.34656 141.550
TreatmentTreatment       0.22623    0.48865   0.463
Time                     0.04587    0.03293   1.393
TreatmentTreatment:Time  0.33615    0.04644   7.239

Correlation of Fixed Effects:
            (Intr) TrtmnT Time
TrtmntTrtmn -0.709
Time        -0.152  0.108
TrtmntTrt:T  0.108 -0.152 -0.709
Time Series
But often, we have many more observations. We can break these time series down into 3 components:
seasonal: variation in value due to repeated patterns over time (e.g. day of week, day of year, day of month, time of day)
trend: changes over time in the value
noise: variation in the value not due to seasonality or trend
Time Series Decomposition
\[
Y_t = Tr_t + S_t + \epsilon_t
\]
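The additive decomposition above can be computed directly in R. As a minimal sketch, this uses the built-in `AirPassengers` monthly series and `stl()` (loess-based decomposition); the dataset and settings are illustrative, not from the slides:

```r
# Decompose a built-in monthly series into seasonal, trend, and noise.
# stl() needs a ts object with a known frequency (AirPassengers is monthly).
fit <- stl(log(AirPassengers), s.window = "periodic")

# The three additive components, Y_t = Tr_t + S_t + e_t (on the log scale):
components <- fit$time.series   # columns: seasonal, trend, remainder
head(components)

# The components sum back to the original series (up to floating point):
reconstructed <- rowSums(components)
max(abs(reconstructed - as.numeric(log(AirPassengers))))
```

The log transform turns the series' multiplicative seasonality into the additive form assumed by \(Y_t = Tr_t + S_t + \epsilon_t\).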
Autocorrelation and Partial Autocorrelation
💡 with MCMC samples, we noted autocorrelation in our chains because each sample in the chain was conditioned on the previous sample
correlation: the correlation between \(x_1\) and \(y\)
Coefficient Interpretation in GLMs
semi-partial correlation: the correlation between \(x_1\) and \(y\), after subtracting the influence of \(x_2\) on \(x_1\)
partial correlation: the correlation between \(x_1\) and \(y\), after subtracting the influence of \(x_2\) on \(x_1\) and subtracting the influence of \(x_2\) on \(y\)
Partial Autocorrelation
partial autocorrelation: the correlation between \(y_t\) and \(y_{t-2}\), after subtracting the influence of \(y_{t-1}\) on \(y_t\) and subtracting the influence of \(y_{t-1}\) on \(y_{t-2}\)
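Base R computes both functions with `acf()` and `pacf()`. As an illustrative sketch (the AR(1) series and coefficient 0.7 are invented for the example), an autoregressive series makes the distinction concrete:

```r
# ACF vs PACF on a simulated AR(1) series: y_t = 0.7 * y_{t-1} + e_t.
set.seed(540)
y <- arima.sim(model = list(ar = 0.7), n = 2000)

# The ACF decays geometrically (~0.7, 0.49, 0.343, ...) because each lag
# inherits correlation through the intermediate values.
acf_vals <- acf(y, lag.max = 5, plot = FALSE)$acf    # includes lag 0 (= 1)

# The PACF subtracts that inherited correlation: only lag 1 is large;
# lags >= 2 are near zero for an AR(1) process.
pacf_vals <- pacf(y, lag.max = 5, plot = FALSE)$acf  # starts at lag 1
```

This is exactly the "subtract the influence of \(y_{t-1}\)" idea above: the PACF at lag 2 is small because all of the lag-2 correlation flows through \(y_{t-1}\).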
Stationarity of a TS
\(\mu\) is constant
\(\sigma\) is constant
no seasonality (autocorrelation depends only on the lag, not on \(t\))
Stationarity
Terminology: White Noise
\(\mu\) is 0
\(\sigma\) is constant
no autocorrelation
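A quick sanity check of those three properties, as a minimal sketch with simulated draws (the sample size and seed are arbitrary):

```r
# White noise: mean 0, constant variance, no autocorrelation.
set.seed(540)
wn <- rnorm(5000, mean = 0, sd = 1)

# Sample mean is near 0, and the sample ACF at every lag >= 1 is near zero
# (for true white noise, roughly within +/- 2/sqrt(n) of 0).
r <- acf(wn, lag.max = 10, plot = FALSE)$acf[-1]   # drop lag 0
mean(wn)
range(r)
```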
Are Non-Stationary Time Series Useless?
This time series is not stationary. Is there anything we can do?
\[
x_t = \beta_0 + \beta_1*t + \epsilon_t
\]
💡 what if we used \(x_t\) to create a new time series, \(z_t\)?
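One way to build such a \(z_t\) is to subtract the fitted trend, \(z_t = x_t - (\hat\beta_0 + \hat\beta_1 t)\). A minimal sketch with simulated data (the coefficients 2 and 0.5 are invented for the example):

```r
# A series with a linear trend is non-stationary, but the residuals after
# removing the fitted trend can be stationary: z_t = x_t - (b0 + b1 * t).
set.seed(540)
t_idx <- 1:200
x <- 2 + 0.5 * t_idx + rnorm(200)       # x_t = beta0 + beta1 * t + e_t

trend_fit <- lm(x ~ t_idx)              # estimate beta0, beta1
z <- residuals(trend_fit)               # detrended series z_t

# z has mean 0 (exactly, by construction of OLS residuals) and no
# remaining linear trend.
mean(z)
```

Differencing, `z <- diff(x)`, is another common transformation with the same goal, and is the "integrated" part of ARIMA below.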
Thus, the current value is predicted from the errors (white-noise components) of previous time steps: you use your past errors to improve your current prediction.
\(ARIMA(p,d,q)\) refers to a model with an autoregressive component of order \(p\), an integrated (differencing) component of order \(d\), and a moving-average component of order \(q\)
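Base R's `stats::arima()` fits this directly. As an illustrative sketch, we simulate from a known ARIMA(1,1,1) (the coefficients 0.6 and 0.4 are invented for the example) and check that the fit recovers them:

```r
# Simulate an ARIMA(1,1,1) series, then fit a model of the same order.
set.seed(540)
x <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.6, ma = 0.4),
               n = 1000)

fit <- arima(x, order = c(1, 1, 1))  # p = 1, d = 1 (one difference), q = 1
coef(fit)                            # estimates near ar1 = 0.6, ma1 = 0.4
```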
ARIMA
\[
k(x,x') = \sigma^2 e^{- \frac{(x-x')^2}{2l}}
\]
where \(l\) and \(\sigma\) are hyperparameters that you choose.
and I spend so much of my day choosing them…istg
Kernels
We could use \(k(x,x')\) to “fill out” the covariance matrix \(\Sigma\) which tells us how each data point influences (or covaries with) each other data point.
Kernel Simulation
Let’s say we want to look at 100 data points at \(t = 0, 1, \ldots, 99\). The kernel gives us information about how these 100 time points covary.
Using the squared exponential kernel we talked about before, we can generate a few different 100-point time series with:
\(\mu = 0\) (for convenience)
\(\Sigma =\sigma^2 e^{- \frac{(x-x')^2}{2l}}\)
Kernel Simulation
Knowing \(\mu\) and \(\Sigma\) we can sample from a \(\mathcal{N}(0,\Sigma)\) distribution.
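A minimal base-R sketch of that sampling step, using the slide's \(\Sigma = \sigma^2 e^{-(x-x')^2/(2l)}\) (the values \(\sigma^2 = 1\) and \(l = 25\) are arbitrary choices for illustration):

```r
# Build the 100x100 squared-exponential covariance and draw one GP sample.
set.seed(540)
t_pts  <- 0:99
sigma2 <- 1      # sigma^2, a hyperparameter
l      <- 25     # length-scale, a hyperparameter

D <- outer(t_pts, t_pts, function(a, b) (a - b)^2)  # squared distances
Sigma <- sigma2 * exp(-D / (2 * l))

# Sample from N(0, Sigma) via the Cholesky factor (jitter for stability):
# if R'R = Sigma, then t(R) %*% z with z ~ N(0, I) has covariance Sigma.
R <- chol(Sigma + diag(1e-8, 100))
f <- as.vector(t(R) %*% rnorm(100))  # one smooth 100-point time series
```

Larger \(l\) makes distant time points covary more strongly, so the sampled series gets smoother; tiny \(l\) approaches white noise.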
Kernel Simulation
Conditioning on Observed Data
Alone, the kernel acts as a prior that encodes our assumptions about the shape of the GP. Now, we incorporate the observed data by conditioning on it.
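For GP regression with Gaussian noise, conditioning has a closed form: the posterior mean at new points is \(K_*^\top (K + \sigma_n^2 I)^{-1} y\). A minimal base-R sketch (the observed points, \(\sin\) target, and noise variance are all invented for illustration):

```r
# Condition a zero-mean GP prior on a few noisy observations.
k <- function(a, b, sigma2 = 1, l = 2) {
  sigma2 * exp(-outer(a, b, "-")^2 / (2 * l))  # squared-exponential kernel
}

x_obs <- c(1, 3, 5, 7)          # observed inputs (illustrative)
y_obs <- sin(x_obs)             # observed values
x_new <- seq(0, 8, by = 0.5)    # prediction points
sn2   <- 1e-4                   # small observation-noise variance

K  <- k(x_obs, x_obs) + diag(sn2, length(x_obs))  # cov among observations
Ks <- k(x_new, x_obs)                             # cov: new vs observed

post_mean <- Ks %*% solve(K, y_obs)  # posterior mean at x_new
```

At the observed inputs the posterior mean nearly interpolates \(y\) (the tiny \(\sigma_n^2\) keeps it close); between them, the kernel's smoothness assumption fills in the curve.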
Conditioning on Observed Data
Seasonal GP Kernels
Gaussian Processes in R
library(kernlab)

# Generate synthetic time series data
set.seed(540)
time <- seq(0, 10, length.out = 100)
y <- sin(time) + rnorm(100, sd = 0.2)

# Fit a Gaussian Process model using the radial basis function kernel
gp_model <- gausspr(time, y, kernel = "rbfdot", var = 0.1,
                    variance.model = TRUE)

# Predict using the fitted Gaussian Process model
time_test <- seq(0, 14, length.out = 1000)
y_pred <- predict(gp_model, time_test)
y_up  <- y_pred + 2 * predict(gp_model, time_test, type = "sdeviation")
y_low <- y_pred - 2 * predict(gp_model, time_test, type = "sdeviation")
Gaussian Processes in R
Using automatic sigma estimation (sigest) for RBF or laplace kernel